A comprehensive guide to understanding and implementing video compression algorithms in Python. Learn the theory and practice behind modern video codecs.
Building a Video Codec in Python: A Deep Dive into Compression Algorithms
In our hyper-connected world, video is king. From streaming services and video conferencing to social media feeds, digital video dominates internet traffic. But how can a high-definition movie be sent over a standard internet connection? The answer lies in a fascinating and complex field: video compression. At the heart of this technology is the video codec (COder-DECoder), a sophisticated set of algorithms designed to reduce file size drastically while preserving visual quality.
Industry-standard codecs such as H.264, HEVC (H.265), and the royalty-free AV1 are remarkably complex pieces of engineering, but their fundamental principles are within reach of any motivated developer. This guide takes a deep dive into the world of video compression. We won't just talk about theory: we'll build a simplified, educational video codec from scratch in Python. This hands-on approach is the best way to grasp the elegant ideas that make modern video streaming possible.
Why Python? It is not the language you would use for a real-time, high-performance commercial codec (those are usually written in C/C++ or assembly), but Python's readability and powerful libraries such as NumPy, SciPy, and OpenCV make it an ideal environment for learning, prototyping, and research. You can focus on the algorithms without getting bogged down in low-level memory management.
Understanding the Core Concepts of Video Compression
Before writing a single line of code, we need to understand what we are trying to achieve. The goal of video compression is to eliminate redundant data. Raw, uncompressed video is enormous: one minute of 1080p video at 30 frames per second can exceed 7 GB. To tame this data beast, codecs exploit two main types of redundancy.
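To make the size claim concrete, the arithmetic can be sketched as follows. The function name `raw_video_bytes` and the choice of three pixel layouts are illustrative assumptions, not part of the codec built later; the sketch assumes 8-bit samples.

```python
def raw_video_bytes(width, height, fps, seconds, bytes_per_pixel):
    """Size in bytes of uncompressed video with the given geometry and duration."""
    return width * height * bytes_per_pixel * fps * seconds

# Average bytes per pixel for 8-bit samples in three common layouts
layouts = {"RGB 4:4:4": 3.0, "YCbCr 4:2:2": 2.0, "YCbCr 4:2:0": 1.5}

sizes_gb = {
    name: raw_video_bytes(1920, 1080, 30, 60, bpp) / 1e9
    for name, bpp in layouts.items()
}
for name, gb in sizes_gb.items():
    print(f"{name}: {gb:.1f} GB per minute")
```

Under these assumptions one minute comes to about 11.2 GB as RGB, 7.5 GB as 4:2:2, and 5.6 GB as 4:2:0, so the exact figure depends on how the raw pixels are stored.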
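The Two Pillars of Compression: Spatial and Temporal Redundancy
- Spatial (intra-frame) redundancy: This is the redundancy within a single frame. Think of a wide expanse of blue sky or a white wall. Instead of storing a color value for every single pixel in that region, we can describe it far more efficiently. This is the same principle that underlies image compression formats such as JPEG.
- Temporal (inter-frame) redundancy: This is the redundancy between consecutive frames. In most videos, the scene does not change completely from one frame to the next. A person talking in front of a static background, for example, has an enormous amount of temporal redundancy: the background stays the same and only part of the image (the person's face and body) moves. This is the most important source of compression in video.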
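Temporal redundancy is easy to measure on synthetic data. The sketch below builds two hypothetical 64x64 frames (a fixed random background with a white square that moves two pixels to the right) and counts how many pixels actually differ. The frame contents and the helper `frame_with_square` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Static background; values stay below 255 so the white square is always distinct
background = rng.integers(0, 255, size=(64, 64), dtype=np.uint8)

def frame_with_square(x):
    """Background plus a 16x16 white square whose left edge is at column x."""
    frame = background.copy()
    frame[24:40, x:x + 16] = 255
    return frame

frame_a = frame_with_square(10)
frame_b = frame_with_square(12)

changed = np.count_nonzero(frame_a != frame_b)
share = 100 * changed / frame_a.size
print(f"{changed} of {frame_a.size} pixels changed ({share:.1f}%)")
```

Only the two columns uncovered on the left and the two newly covered on the right differ, which is 64 of 4096 pixels. Everything else could, in principle, be copied from the previous frame.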
Key Frame Types: I-frames, P-frames, and B-frames
To exploit temporal redundancy, codecs do not treat every frame equally. Frames are classified into different types and arranged in a sequence called a Group of Pictures (GOP).
- I-frame (intra-coded frame): An I-frame is a complete, self-contained image. Like a JPEG, it is compressed using spatial redundancy only. I-frames act as anchor points in the video stream, letting a viewer start playback or seek to a new position. They are the largest frame type, but without them the video could not be decoded at all.
- P-frame (predicted frame): A P-frame is encoded with reference to a previous I-frame or P-frame. Instead of storing the whole image, it stores only the differences. For example, it stores an instruction such as "take this block of pixels from the last frame, move it 5 pixels to the right, and apply these small color changes." This is achieved through a process called motion estimation.
- B-frame (bi-directionally predicted frame): B-frames are the most efficient. They can use both previous and following frames as prediction references, which helps in scenes where an object is briefly hidden and then reappears. By looking in both directions, the codec can build a more accurate and more data-efficient prediction. However, using future frames introduces a small delay (latency), so B-frames are less suitable for real-time applications such as video calls.
A typical GOP looks like I B B P B B P B B I ... The encoder chooses the frame pattern to balance compression efficiency against seekability.
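The pattern above can be generated with a few lines. This is only a sketch of a fixed pattern in display order; the function name `gop_frame_types` and its parameters are assumptions for illustration, and real encoders usually pick frame types adaptively.

```python
def gop_frame_types(num_frames, gop_size=9, b_frames=2):
    """Return display-order frame types for a fixed I/P/B pattern."""
    types = []
    for n in range(num_frames):
        pos = n % gop_size
        if pos == 0:
            types.append("I")            # every GOP starts with an I-frame
        elif pos % (b_frames + 1) == 0:
            types.append("P")            # a P-frame after each run of B-frames
        else:
            types.append("B")
    return "".join(types)

print(gop_frame_types(10))  # IBBPBBPBBI
```

Because B-frames depend on a later frame, an encoder has to transmit frames in a different order from the one in which they are displayed, which is where the extra latency comes from.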
The Compression Pipeline: A Step-by-Step Breakdown
Modern video encoding is a multi-stage pipeline. Each stage transforms the data so that it becomes easier to compress. Let's walk through the main steps involved in encoding a single frame.

Step 1: Color Space Conversion (RGB to YCbCr)
Most video starts out in the RGB (red, green, blue) color space. The human eye, however, is far more sensitive to changes in brightness (luma) than to changes in color (chroma). Codecs take advantage of this by converting RGB into a luma/chroma format such as YCbCr.
- Y: the luma component (brightness).
- Cb: the blue-difference chroma component.
- Cr: the red-difference chroma component.
Separating brightness from color makes chroma subsampling possible. This technique lowers the resolution of the color channels (Cb and Cr) while keeping the luma channel (Y), to which the eye is most sensitive, at full resolution. A common scheme is 4:2:0, which discards 75% of the color information with almost no perceptible loss in quality, giving instant compression.
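As a rough sketch of this step, the code below converts RGB to YCbCr using the full-range BT.601 coefficients (the ones JPEG uses; an assumption here, since the text does not name a specific matrix) and subsamples chroma by averaging each 2x2 neighbourhood. The function names are illustrative.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an (H, W, 3) RGB array to float Y, Cb, Cr planes (full-range BT.601)."""
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def subsample_420(chroma):
    """Halve both dimensions by averaging 2x2 groups (assumes even height and width)."""
    h, w = chroma.shape
    return chroma.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

rgb = np.random.default_rng(0).integers(0, 256, size=(16, 16, 3), dtype=np.uint8)
y, cb, cr = rgb_to_ycbcr(rgb)
cb_sub, cr_sub = subsample_420(cb), subsample_420(cr)

total_before = y.size + cb.size + cr.size
total_after = y.size + cb_sub.size + cr_sub.size
print(total_after / total_before)  # 0.5
```

Each chroma plane keeps one sample in four (the 75% mentioned above), so the total number of samples per frame is halved before any other compression step runs.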
Step 2: Frame Partitioning (Macroblocks)
The encoder does not process the whole frame at once. It divides the frame into smaller blocks, typically 16x16 or 8x8 pixels, called macroblocks. All subsequent processing steps (prediction, transform, and so on) are carried out block by block.
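Partitioning can be expressed as a pure reshape in NumPy. This is a minimal sketch; `split_into_blocks` is a hypothetical helper, separate from the loop-based helper used in the codec later, and it assumes the frame dimensions are exact multiples of the block size.

```python
import numpy as np

def split_into_blocks(frame, block_size=8):
    """Return an array of shape (rows, cols, block_size, block_size)."""
    h, w = frame.shape
    if h % block_size or w % block_size:
        raise ValueError("frame dimensions must be multiples of block_size")
    return (frame.reshape(h // block_size, block_size, w // block_size, block_size)
                 .swapaxes(1, 2))

frame = np.arange(16 * 24).reshape(16, 24)
blocks = split_into_blocks(frame)
print(blocks.shape)  # (2, 3, 8, 8)
```

Here `blocks[r, c]` is the block whose top-left pixel is at row `8 * r`, column `8 * c` of the frame.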
Step 3: Prediction (Inter and Intra)
This is where the magic happens. For each macroblock, the encoder decides whether to use intra-frame or inter-frame prediction.
- For I-frames (intra prediction): The encoder predicts the current block from neighboring pixels in the same frame that have already been encoded (the blocks above and to the left). It then only needs to encode the small difference between the prediction and the actual block, known as the residual.
- For P-frames or B-frames (inter prediction): This is motion estimation. The encoder searches the reference frame for a matching block. Once it finds the best match, it records a motion vector (for example, "move 10 pixels right and 2 pixels down") and computes the residual. The residual is often close to zero, so very few bits are needed to encode it.
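Inter prediction is implemented later in the encoder, so here is a minimal sketch of the intra case instead. It uses a single "DC" style predictor (fill the block with the mean of the pixels directly above and to the left), which is only one of several modes real codecs offer. The function `dc_predict` and the test frame are assumptions for illustration, and a real encoder would predict from reconstructed neighbours rather than original pixels.

```python
import numpy as np

def dc_predict(frame, top, left, size=8):
    """Predict a block as the mean of the pixels directly above and to the left."""
    neighbours = []
    if top > 0:
        neighbours.append(frame[top - 1, left:left + size])
    if left > 0:
        neighbours.append(frame[top:top + size, left - 1])
    if not neighbours:
        return np.full((size, size), 128.0)  # no neighbours: fall back to mid-gray
    return np.full((size, size), np.concatenate(neighbours).mean())

# Flat frame with a gentle horizontal gradient in the bottom-right block
frame = np.full((16, 16), 100.0)
frame[8:, 8:] += np.arange(8)

prediction = dc_predict(frame, 8, 8)
residual = frame[8:16, 8:16] - prediction
print(residual.min(), residual.max())  # 0.0 7.0
```

The pixel values in the block range from 100 to 107, but the residual only spans 0 to 7, which is much cheaper to encode.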
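Step 4: Transform (e.g., Discrete Cosine Transform - DCT)
After prediction we are left with a residual block. This block is passed through a mathematical transform such as the Discrete Cosine Transform (DCT). The DCT does not compress the data by itself, but it fundamentally changes how the data is represented: it converts spatial pixel values into frequency coefficients. The magic of the DCT is that, for most natural images, most of the visual energy ends up concentrated in a handful of coefficients in the top-left corner of the block (the low-frequency components), while the remaining coefficients (high-frequency detail and noise) are close to zero.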
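The energy compaction property can be checked numerically. The sketch below applies `scipy.fft.dctn` to a hypothetical smooth 8x8 gradient block (standing in for a patch of sky) and measures how much of the squared coefficient energy lands in the 2x2 low-frequency corner.

```python
import numpy as np
from scipy.fft import dctn

# A smooth 8x8 gradient: brightness rises gently to the right and downwards
x = np.arange(8)
block = 100.0 + 4.0 * x[None, :] + 2.0 * x[:, None]

# Center around zero, then take the orthonormal 2D DCT (type II)
coeffs = dctn(block - 128.0, norm="ortho")

energy = coeffs ** 2
top_left_share = energy[:2, :2].sum() / energy.sum()
print(f"{100 * top_left_share:.1f}% of the energy is in the 2x2 low-frequency corner")
```

For a block this smooth, almost all of the energy sits in four of the 64 coefficients. Noisy or highly textured blocks spread their energy more widely, which is why they cost more bits.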
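Step 5: Quantization
This is the main lossy step in the pipeline and the key to controlling the trade-off between quality and bitrate. The block of DCT coefficients is divided element-wise by a quantization matrix, and the results are rounded to the nearest integer. The quantization matrix has larger values for the high-frequency coefficients, which effectively squashes many of them to zero. This is where a large amount of data is thrown away. The higher the quantization parameter, the more zeros there are, the higher the compression ratio, and the lower the visual quality (often visible as blocky artifacts).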
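A small experiment shows the effect of the quantization parameter. The matrix `Q` below is a made-up one whose step size simply grows with frequency (it is not the JPEG table used later in the codec), and the test block is a synthetic gradient with mild noise.

```python
import numpy as np
from scipy.fft import dctn, idctn

# Hypothetical quantization matrix: coarser steps for higher frequencies
i, j = np.indices((8, 8))
Q = 16.0 + 8.0 * (i + j)

rng = np.random.default_rng(1)
x = np.arange(8)
block = 120.0 + 5.0 * x[None, :] + rng.normal(0, 2, size=(8, 8))

coeffs = dctn(block - 128.0, norm="ortho")

results = {}
for qp in (1, 4):
    step = Q * qp
    quantized = np.round(coeffs / step).astype(int)             # the lossy step
    restored = idctn(quantized * step, norm="ortho") + 128.0    # what a decoder would see
    results[qp] = (np.count_nonzero(quantized), np.abs(restored - block).mean())
    print(f"qp={qp}: {results[qp][0]} nonzero coefficients, "
          f"mean abs error {results[qp][1]:.2f}")
```

A larger `qp` can never produce more nonzero coefficients than a smaller one, and it typically increases the reconstruction error, which is the quality-versus-bitrate trade-off in miniature.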
Step 6: Entropy Coding
The final stage is a lossless compression step. The quantized coefficients, motion vectors, and other metadata are scanned and converted into a binary stream. Techniques such as run-length encoding (RLE) and Huffman coding are used, as are more advanced methods such as CABAC (context-adaptive binary arithmetic coding). These algorithms assign short codes to frequent symbols (such as the many zeros produced by quantization) and longer codes to rare symbols, squeezing the last few bits out of the data stream.
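The first half of this stage can be sketched with a zigzag scan followed by run-length encoding. The `(zeros_before, value)` pair format and the `"EOB"` (end of block) marker below are simplifying assumptions loosely modelled on JPEG; a real codec would then feed these symbols into a Huffman or arithmetic coder.

```python
import numpy as np

def zigzag_indices(n=8):
    """Return (row, col) pairs in zigzag order, lowest frequencies first."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk anti-diagonals; alternate direction on odd and even diagonals
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def run_length_encode(values):
    """Encode as (zeros_before, value) pairs; trailing zeros collapse into 'EOB'."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, int(v)))
            run = 0
    pairs.append("EOB")
    return pairs

# A typical quantized block: a few low-frequency values, zeros everywhere else
block = np.zeros((8, 8), dtype=int)
block[0, 0], block[0, 1], block[0, 2] = -26, 3, 5

scanned = [block[r, c] for r, c in zigzag_indices()]
print(run_length_encode(scanned))  # [(0, -26), (0, 3), (3, 5), 'EOB']
```

The zigzag order groups the high-frequency zeros at the end of the scan, so 64 coefficients collapse into four symbols.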
The decoder simply runs these steps in reverse: entropy decoding -> inverse quantization -> inverse transform -> motion compensation -> frame reconstruction.
Implementing a Simplified Video Codec in Python
Now let's put the theory into practice. We will build an educational codec that uses I-frames and P-frames. It demonstrates the core pipeline: motion estimation, DCT, quantization, and the corresponding decoding steps.
Disclaimer: This is a *toy* codec designed for learning. It is not optimized and will not produce results comparable to H.264. Our goal is to see the algorithms in action.
Prerequisites
You will need the following Python libraries, which can be installed with pip:
pip install numpy opencv-python scipy
Project Structure
Let's organize the code into a few files:
- main.py: the main script that runs the encoding and decoding process.
- encoder.py: contains the encoder logic.
- decoder.py: contains the decoder logic.
- utils.py: helper functions for block handling and transforms.
Part 1: Core Utilities (`utils.py`)
We start with helper functions for the DCT, quantization, and their inverses. We also need functions to split a frame into blocks and to reassemble it.
# utils.py
import numpy as np
from scipy.fft import dctn, idctn

BLOCK_SIZE = 8

# The standard JPEG luminance quantization matrix (scaled by qp at run time)
QUANTIZATION_MATRIX = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99]
])


def apply_dct(block, center=True):
    """Applies 2D DCT to a block.

    With center=True, pixel values are shifted from [0, 255] to [-128, 127].
    Pass center=False for residuals, which are already centered on 0.
    """
    block = np.asarray(block, dtype=float)
    if center:
        block = block - 128
    return dctn(block, norm='ortho')


def apply_idct(dct_block, center=True):
    """Applies 2D Inverse DCT to a block.

    With center=True, the pixel shift is undone and values are clipped to [0, 255].
    With center=False, the signed residual is returned without shifting or clipping.
    """
    block = idctn(np.asarray(dct_block, dtype=float), norm='ortho')
    if center:
        return np.round(block + 128).clip(0, 255)
    return np.round(block)


def quantize(dct_block, qp=1):
    """Quantizes a DCT block. qp is a quality parameter (higher = coarser)."""
    return np.round(dct_block / (QUANTIZATION_MATRIX * qp)).astype(int)


def dequantize(quantized_block, qp=1):
    """Dequantizes a block."""
    return quantized_block * (QUANTIZATION_MATRIX * qp)


def frame_to_blocks(frame):
    """Splits a frame into 8x8 blocks (dimensions must be multiples of BLOCK_SIZE)."""
    blocks = []
    h, w = frame.shape
    for i in range(0, h, BLOCK_SIZE):
        for j in range(0, w, BLOCK_SIZE):
            blocks.append(frame[i:i+BLOCK_SIZE, j:j+BLOCK_SIZE])
    return blocks


def blocks_to_frame(blocks, h, w):
    """Reconstructs a frame from 8x8 blocks."""
    frame = np.zeros((h, w), dtype=np.uint8)
    k = 0
    for i in range(0, h, BLOCK_SIZE):
        for j in range(0, w, BLOCK_SIZE):
            frame[i:i+BLOCK_SIZE, j:j+BLOCK_SIZE] = blocks[k]
            k += 1
    return frame

Part 2: The Encoder (`encoder.py`)
The encoder is the most complex part. We implement a simple block-matching algorithm for motion estimation and handle both I-frames and P-frames.

# encoder.py
import numpy as np
from utils import apply_dct, quantize, frame_to_blocks, BLOCK_SIZE


def get_motion_vectors(current_frame, reference_frame, search_range=8):
    """A simple exhaustive block matching algorithm for motion estimation."""
    h, w = current_frame.shape
    # Work in a signed type: subtracting uint8 arrays would wrap around
    current = current_frame.astype(np.int32)
    reference = reference_frame.astype(np.int32)
    motion_vectors = []
    for i in range(0, h, BLOCK_SIZE):
        for j in range(0, w, BLOCK_SIZE):
            current_block = current[i:i+BLOCK_SIZE, j:j+BLOCK_SIZE]
            best_match_sad = float('inf')
            best_match_vector = (0, 0)
            # Search a window of the reference frame around the block position
            for y in range(-search_range, search_range + 1):
                for x in range(-search_range, search_range + 1):
                    ref_i, ref_j = i + y, j + x
                    # Skip candidates that fall outside the frame
                    if 0 <= ref_i <= h - BLOCK_SIZE and 0 <= ref_j <= w - BLOCK_SIZE:
                        ref_block = reference[ref_i:ref_i+BLOCK_SIZE, ref_j:ref_j+BLOCK_SIZE]
                        # Sum of absolute differences: lower means a better match
                        sad = np.sum(np.abs(current_block - ref_block))
                        if sad < best_match_sad:
                            best_match_sad = sad
                            best_match_vector = (y, x)
            motion_vectors.append(best_match_vector)
    return motion_vectors


def encode_iframe(frame, qp=1):
    """Encodes an I-frame."""
    h, w = frame.shape
    blocks = frame_to_blocks(frame)
    quantized_blocks = []
    for block in blocks:
        dct_block = apply_dct(block.astype(float))
        quantized_block = quantize(dct_block, qp)
        quantized_blocks.append(quantized_block)
    return {'type': 'I', 'h': h, 'w': w, 'data': quantized_blocks, 'qp': qp}


def encode_pframe(current_frame, reference_frame, qp=1):
    """Encodes a P-frame."""
    h, w = current_frame.shape
    motion_vectors = get_motion_vectors(current_frame, reference_frame)
    quantized_residuals = []
    k = 0
    for i in range(0, h, BLOCK_SIZE):
        for j in range(0, w, BLOCK_SIZE):
            current_block = current_frame[i:i+BLOCK_SIZE, j:j+BLOCK_SIZE]
            mv_y, mv_x = motion_vectors[k]
            ref_block = reference_frame[i+mv_y : i+mv_y+BLOCK_SIZE, j+mv_x : j+mv_x+BLOCK_SIZE]
            residual = current_block.astype(float) - ref_block.astype(float)
            # Residuals are signed and centered on 0, so skip the -128 pixel shift
            dct_residual = apply_dct(residual, center=False)
            quantized_residual = quantize(dct_residual, qp)
            quantized_residuals.append(quantized_residual)
            k += 1
    return {'type': 'P', 'motion_vectors': motion_vectors, 'data': quantized_residuals, 'qp': qp}

Part 3: The Decoder (`decoder.py`)
The decoder runs the process in reverse. For P-frames, it performs motion compensation using the stored motion vectors.

# decoder.py
import numpy as np
from utils import apply_idct, dequantize, blocks_to_frame, BLOCK_SIZE


def decode_iframe(encoded_frame):
    """Decodes an I-frame."""
    h, w = encoded_frame['h'], encoded_frame['w']
    qp = encoded_frame['qp']
    quantized_blocks = encoded_frame['data']
    reconstructed_blocks = []
    for q_block in quantized_blocks:
        dct_block = dequantize(q_block, qp)
        block = apply_idct(dct_block)
        reconstructed_blocks.append(block.astype(np.uint8))
    return blocks_to_frame(reconstructed_blocks, h, w)


def decode_pframe(encoded_frame, reference_frame):
    """Decodes a P-frame using its reference frame."""
    h, w = reference_frame.shape
    qp = encoded_frame['qp']
    motion_vectors = encoded_frame['motion_vectors']
    quantized_residuals = encoded_frame['data']
    reconstructed_blocks = []
    k = 0
    for i in range(0, h, BLOCK_SIZE):
        for j in range(0, w, BLOCK_SIZE):
            # Decode the residual (signed values: no +128 shift and no clipping)
            dct_residual = dequantize(quantized_residuals[k], qp)
            residual = apply_idct(dct_residual, center=False)
            # Perform motion compensation
            mv_y, mv_x = motion_vectors[k]
            ref_block = reference_frame[i+mv_y : i+mv_y+BLOCK_SIZE, j+mv_x : j+mv_x+BLOCK_SIZE]
            # Reconstruct the block and clip only the final pixel values
            reconstructed_block = (ref_block.astype(float) + residual).clip(0, 255)
            reconstructed_blocks.append(reconstructed_block.astype(np.uint8))
            k += 1
    return blocks_to_frame(reconstructed_blocks, h, w)
Part 4: Putting It All Together (`main.py`)
This script orchestrates the whole process. It reads the video, encodes it frame by frame, then decodes it to produce the final output.
# main.py
import pickle  # For saving/loading our compressed data structure

import cv2

from decoder import decode_iframe, decode_pframe
from encoder import encode_iframe, encode_pframe
from utils import BLOCK_SIZE


def main(input_path, output_path, compressed_file_path):
    cap = cv2.VideoCapture(input_path)
    # Reuse the source frame rate; fall back to 30 if the container reports 0
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    frames = []
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        # We'll work with the grayscale (luma) channel for simplicity
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Crop to multiples of BLOCK_SIZE: the block code assumes full 8x8 blocks
        h, w = gray.shape
        frames.append(gray[:h - h % BLOCK_SIZE, :w - w % BLOCK_SIZE])
    cap.release()
    if not frames:
        raise ValueError(f"No frames could be read from {input_path}")

    # --- ENCODING --- #
    print("Encoding...")
    compressed_data = []
    reference_frame = None
    gop_size = 12  # I-frame every 12 frames
    for i, frame in enumerate(frames):
        if i % gop_size == 0:
            # Encode as I-frame
            encoded_frame = encode_iframe(frame, qp=2.5)
            compressed_data.append(encoded_frame)
            print(f"Encoded frame {i} as I-frame")
        else:
            # Encode as P-frame
            encoded_frame = encode_pframe(frame, reference_frame, qp=2.5)
            compressed_data.append(encoded_frame)
            print(f"Encoded frame {i} as P-frame")
        # The reference for the next P-frame needs to be the *reconstructed* last frame,
        # so that encoder and decoder predict from identical data
        if encoded_frame['type'] == 'I':
            reference_frame = decode_iframe(encoded_frame)
        else:
            reference_frame = decode_pframe(encoded_frame, reference_frame)
    with open(compressed_file_path, 'wb') as f:
        pickle.dump(compressed_data, f)
    print(f"Compressed data saved to {compressed_file_path}")

    # --- DECODING --- #
    print("\nDecoding...")
    with open(compressed_file_path, 'rb') as f:
        loaded_compressed_data = pickle.load(f)
    decoded_frames = []
    reference_frame = None
    for i, encoded_frame in enumerate(loaded_compressed_data):
        if encoded_frame['type'] == 'I':
            decoded_frame = decode_iframe(encoded_frame)
            print(f"Decoded frame {i} (I-frame)")
        else:
            decoded_frame = decode_pframe(encoded_frame, reference_frame)
            print(f"Decoded frame {i} (P-frame)")
        decoded_frames.append(decoded_frame)
        reference_frame = decoded_frame

    # --- WRITING OUTPUT VIDEO --- #
    h, w = decoded_frames[0].shape
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    out = cv2.VideoWriter(output_path, fourcc, fps, (w, h), isColor=False)
    for frame in decoded_frames:
        out.write(frame)
    out.release()
    print(f"Decoded video saved to {output_path}")


if __name__ == '__main__':
    main('input.mp4', 'output.mp4', 'compressed.bin')
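Analyzing the Results and Exploring Further
Running the `main.py` script on an `input.mp4` file produces two files: `compressed.bin`, which holds our custom compressed video data, and `output.mp4`, the reconstructed video. To check the compression ratio, compare the size of `input.mp4` with that of `compressed.bin`. Keep in mind that `input.mp4` is itself already compressed by a production codec and that `compressed.bin` has no entropy coding, so the size of the raw grayscale frames (width x height x frame count, in bytes) is the fairer baseline. To check quality, inspect `output.mp4` visually. Especially at high `qp` values, you are likely to see blocky artifacts, the classic symptom of quantization.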
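The size comparison can be scripted. This is a minimal sketch: the helper names `raw_grayscale_bytes` and `compression_ratio` are hypothetical, and the clip dimensions in the usage lines are placeholders that should be replaced with the values of your own video.

```python
import os

def raw_grayscale_bytes(num_frames, height, width):
    """Uncompressed size of 8-bit grayscale video: one byte per pixel."""
    return num_frames * height * width

def compression_ratio(raw_bytes, compressed_path):
    """How many times smaller the compressed file is than the raw frames."""
    return raw_bytes / os.path.getsize(compressed_path)

if os.path.exists("compressed.bin"):
    # Placeholder dimensions: substitute the frame count and size of your clip
    raw = raw_grayscale_bytes(num_frames=300, height=720, width=1280)
    print(f"Ratio vs raw frames: {compression_ratio(raw, 'compressed.bin'):.2f}x")
```

A ratio below 1 is quite possible here, because pickling NumPy integer arrays stores every quantized coefficient at full width, zeros included.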
Measuring Quality: Peak Signal-to-Noise Ratio (PSNR)
A common objective metric for reconstruction quality is PSNR. It compares the original frame with the decoded frame; in general, a higher PSNR indicates better quality.
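import math

import numpy as np


def calculate_psnr(original, compressed):
    """Returns the PSNR in dB between two 8-bit frames of equal shape."""
    # Cast to float first: subtracting uint8 frames would wrap around
    diff = original.astype(np.float64) - compressed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float('inf')  # identical frames
    max_pixel = 255.0
    psnr = 20 * math.log10(max_pixel / math.sqrt(mse))
    return psnr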
Limitations and Next Steps
Our simple codec is a great start, but it is far from perfect. Here are some of its limitations, along with potential improvements that mirror the evolution of real-world codecs.
- Motion estimation: Our exhaustive search is slow and basic. Real codecs use sophisticated hierarchical search algorithms to find motion vectors far more quickly.
- B-frames: We implemented only P-frames. Adding B-frames would improve compression efficiency considerably, at the cost of extra complexity and latency.
- Entropy coding: We did not implement a proper entropy coding stage and simply pickled Python data structures. Adding a run-length encoder for the quantized zeros, followed by a Huffman or arithmetic coder, would shrink the file size further.
- Deblocking filter: The sharp edges between 8x8 blocks produce visible artifacts. Modern codecs apply a deblocking filter after reconstruction to smooth these edges and improve visual quality.
- Variable block sizes: Modern codecs do not rely solely on fixed 16x16 macroblocks. They can adaptively partition the frame into blocks of various sizes and shapes to match the content more closely (for example, larger blocks for flat regions and smaller blocks for detailed regions).
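As one illustration of a faster motion search, here is a sketch of the classic three-step search, which tests nine candidates around the current best position and halves the step each round. It is a heuristic swapped in for illustration (not what production encoders use verbatim), it can miss the true best match on noisy content, and the function names and test data are assumptions.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences, computed in a signed type."""
    return np.abs(a.astype(np.int32) - b.astype(np.int32)).sum()

def three_step_search(cur, ref, top, left, size=8, step=4):
    """Return a (dy, dx) motion vector found by a coarse-to-fine search."""
    h, w = ref.shape
    block = cur[top:top + size, left:left + size]
    best = (0, 0)
    best_cost = sad(block, ref[top:top + size, left:left + size])
    while step >= 1:
        center = best
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                y, x = top + center[0] + dy, left + center[1] + dx
                # Only consider candidates that lie fully inside the frame
                if 0 <= y <= h - size and 0 <= x <= w - size:
                    cost = sad(block, ref[y:y + size, x:x + size])
                    if cost < best_cost:
                        best_cost, best = cost, (center[0] + dy, center[1] + dx)
        step //= 2
    return best

# Smooth synthetic frame; the current frame is the reference shifted by (2, 3)
yy, xx = np.indices((48, 48))
ref = ((yy - 24) ** 2 + (xx - 24) ** 2).astype(np.int32)
cur = np.roll(ref, shift=(-2, -3), axis=(0, 1))

mv = three_step_search(cur, ref, top=16, left=16)
print("motion vector:", mv)
```

With steps of 4, 2, and 1 this evaluates at most 25 positions per block, compared with 289 for an exhaustive search over the same +/-8 window.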
Conclusion
Building a video codec, even a simplified one, is a very rewarding exercise. It demystifies the technology that powers so much of our digital lives. We explored the core concepts of spatial and temporal redundancy, walked through the key stages of the encoding pipeline (prediction, transform, and quantization), and implemented those ideas in Python.
The code provided here is a starting point, so please experiment with it. Try changing the block size, the quantization parameter (`qp`), or the GOP length. Try implementing a simple run-length encoding scheme, or take on the challenge of adding B-frames. By building things and breaking them, you gain a deep appreciation for the ingenuity behind the seamless video experience we take for granted. The world of video compression is vast and constantly evolving, offering endless opportunities for learning and innovation.